Coreference Clustering using Column Generation
Abstract
In this paper we describe a novel way of generating an optimal clustering for coreference resolution. Whereas heuristics are usually used to generate a document-level clustering from the output of local pairwise classifiers, we propose a method that calculates an exact solution. We cast the clustering problem as an Integer Linear Programming (ILP) problem and solve it using a column generation approach. By exploiting structural information, column generation is well suited to ILP problems with a large number of variables and few constraints. Building on a state-of-the-art framework for coreference resolution, we implement several strategies for clustering. We demonstrate a significant speedup compared to state-of-the-art approaches to solving the clustering problem with ILP, while maintaining transitivity of the coreference relation. Empirical evidence suggests linear time complexity, compared to the cubic complexity of other methods.
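To make the "many variables, few constraints" remark concrete, here is a minimal sketch of one common way such a clustering problem can be set up as a set-partitioning problem: every candidate cluster is a column (variable), and there is one constraint per mention requiring that it belongs to exactly one selected cluster. This is an assumption for illustration and is not taken from the paper. The objective (sum of pairwise scores inside each cluster), the `pair_score` table, and all function names are hypothetical. The sketch also does not perform column generation: it enumerates every partition exhaustively, which is feasible only for a handful of mentions and is exactly the blow-up that column generation is meant to avoid.

```python
from itertools import combinations


def cluster_score(cluster, pair_score):
    """Sum of pairwise scores over all mention pairs inside one cluster.

    Assumed objective: positive score = likely coreferent, negative = unlikely.
    """
    return sum(pair_score[(a, b)] for a, b in combinations(sorted(cluster), 2))


def all_partitions(items):
    """Yield every partition of `items` as a list of frozensets."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    # Choose which of the remaining items share a cluster with `first`,
    # then recursively partition whatever is left over.
    for k in range(len(rest) + 1):
        for companions in combinations(rest, k):
            cluster = frozenset((first,) + companions)
            remaining = [m for m in rest if m not in companions]
            for tail in all_partitions(remaining):
                yield [cluster] + tail


def best_clustering(mentions, pair_score):
    """Return the highest-scoring partition and its score (exhaustive search)."""
    def total(partition):
        return sum(cluster_score(c, pair_score) for c in partition)

    best = max(all_partitions(list(mentions)), key=total)
    return best, total(best)


# Hypothetical pairwise scores for four mentions (keys are ordered pairs a < b).
pair_score = {
    (0, 1): 2.0, (0, 2): -1.0, (0, 3): -1.5,
    (1, 2): 0.5, (1, 3): -2.0, (2, 3): 1.0,
}
clusters, score = best_clustering(range(4), pair_score)
print(sorted(sorted(c) for c in clusters), score)
```

Because each selected column is a whole cluster, any feasible solution is a partition of the mentions, so transitivity of the coreference relation holds by construction rather than through explicit per-triple constraints.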
Related resources
A Hierarchical Distance-dependent Bayesian Model for Event Coreference Resolution
We present a novel hierarchical distance-dependent Bayesian model for event coreference resolution. While existing generative models for event coreference resolution are completely unsupervised, our model allows for the incorporation of pairwise distances between event mentions, information that is widely used in supervised coreference models to guide the generative clustering processing for be...
Coreference resolution with deep learning in the Persian Language
Coreference resolution is an advanced issue in natural language processing. Nowadays, due to the extension of social networks, TV channels, news agencies, the Internet, etc. in human life, reading all the contents, analyzing them, and finding a relation between them require time and cost. In the present era, text analysis is performed using various natural language processing techniques, one ...
Noun Phrase Coreference as Clustering
This paper introduces a new, unsupervised algorithm for noun phrase coreference resolution. It differs from existing methods in that it views coreference resolution as a clustering task. In an evaluation on the MUC-6 coreference resolution corpus, the algorithm achieves an F-measure of 53.6%, placing it firmly between the worst (40%) and best (65%) systems in the MUC-6 evaluation. More importan...
Cross-document Coreference for WePS
A good clustering performance depends on the quality of the distance function used to assess similarity. In this paper we propose a pairwise document coreference model to improve performance over a word-vector similarity approach for the WePS 3 clustering task. We identify a simple criterion which discriminates between highly ambiguous queries, i.e. many small clusters, and balanced queries, i.e....
Machine Learning Approaches to Coreference Resolution
This paper introduces three machine learning approaches to noun phrase coreference resolution. The first of them gives a view of coreference resolution as a clustering task. The second one applies a noun phrase coreference system based on decision tree induction and the last one experiments with using the Bell tree to represent the search space of the coreference resolution problem. The knowled...
Publication date: 2012